ABSTRACT

In this paper, various flip-flop structures are
studied. To reduce power consumption, the pulse
generator circuitry should be built in along with
the flip-flop itself. If a pulse generator is
included with the DPSCRFF structure, power
consumption can be reduced. In this work, a new
flip-flop design, the Double Pulse Latch Flip-flop
(DPLFF), is proposed. DPLFF eliminates unnecessary
glitches, which consume extra power. DPLFF
consumes less power for the same delay than other
existing techniques, which include some of the
fastest known flip-flops. In serial operation as a
shift register, the proposed DPLFF can operate at a
higher frequency. The stacking of transistors in
the latch stage reduces the subthreshold leakage
current, and thus the static power consumption is
also lower for DPLFF. This makes it better suited
to low-power circuits in deep-submicron technology,
where leakage is more dominant.

Keywords: DPLFF, Delay, Power Consumption, 
Speed, Latch, Flip-flop, Pulse Generator 

I. INTRODUCTION 

The relentless drive toward smaller, faster, and 
cheaper computing systems has, in large part, been 
enabled by exponential increases in device density 
and operating frequency through VLSI technology 
scaling. This, however, has led to exponential
increases in power consumption, which has reached the
limits of reliability and cost-effective cooling. Also,
continued scaling into the nanometer regime has
introduced issues such as signal integrity, design
robustness, soft errors, and environmental and process
variability. Furthermore, the issues of power
consumption and robustness only get worse with time.
This has created a crisis in computer system design
that threatens to be a stumbling block to future
advancement. Designers of leading-edge computing
systems, at any scale, are finding that power
consumption and design robustness are first-class
constraints that must be taken into account at every
level of design.

Flip-flops are critical timing elements in digital
circuits and have a large impact on circuit speed and
power consumption. Consequently, extensive research
has been performed to develop fast, low-power
flip-flops [1-5]. The primary measure of performance
of a flip-flop is the minimum D-to-Q delay [1], as it
indicates the speed of operation of the flip-flop in
each clock cycle. Recently, pulsed latch structures
have emerged as the fastest known flip-flop structures
[1], [2]. By reducing the transparency period of a
latch to a narrow window, the latch can operate as a
flip-flop with the additional advantage of allowing
limited time borrowing across cycle boundaries, which
reduces sensitivity to clock skew and jitter. In this
work we have designed a new flip-flop called the
Double Pulse Latch Flip-flop (DPLFF), which is better
in terms of power and performance than the Double
Pulsed Set Conditional Reset Flip-flop (DPSCRFF).

II. BACKGROUND - SCALING & THE 
CHALLENGES FOR FUTURE COMPUTING 

Integrated circuit technology has advanced
tremendously over the past 40 years, as predicted by
Moore’s Law [5]. Device counts have grown
exponentially, from the 2300 transistors of the Intel
4004 processor in 1971 to the 1700 million
transistors of the Intel dual-core Itanium 2 processor
in 2006. Simultaneously, clock frequencies have
increased exponentially, from 0.1 MHz in the Intel
4004 to 3.8 GHz in the Intel Pentium 4. Historically,
and according to predictions in the International
Technology Roadmap for Semiconductors (ITRS) [6],
each technology generation, which occurs at 2.5-3
year intervals, brings with it a 0.7x scaling in drawn
gate length as well as other layout geometry lengths.
The physical gate length follows the same 0.7x
scaling. Assuming a constant die size, this means a 2x
scaling in device count and a 1.4x scaling in total
transistor width. In addition, the intrinsic switching
speed of a transistor increases at roughly 1.5x per
generation. On the other hand, power consumption
has been increasing at 20% per year and has reached
power density limits. At the same time, noise from
many sources, as a fraction of power supply voltage,
has increased, while noise sensitivity has also
increased. These factors, together with increased
relative process variation and environmental variation,
have made predictability and robustness difficult to
achieve in new designs. This section explains the
connection between scaling, power consumption, and
design robustness.
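
The 2x and 1.4x figures quoted above follow from simple geometry. As a rough check, the sketch below assumes only that every layout dimension shrinks by the same factor on a fixed die area; the function name and the ideal-shrink model are illustrative assumptions and are not taken from the ITRS data.

```python
def generation_scaling(length_scale=0.7):
    """Per-generation factors implied by a uniform linear shrink on a fixed die area."""
    area_per_device = length_scale ** 2          # footprint shrinks in both dimensions
    device_count = 1.0 / area_per_device         # same die area holds 1/s^2 as many devices
    total_width = device_count * length_scale    # each device is also s times as wide
    return {"device_count": device_count, "total_width": total_width}


factors = generation_scaling()
print(f"device count x{factors['device_count']:.2f}")   # about 2x
print(f"total width  x{factors['total_width']:.2f}")    # about 1.4x
```

Under this idealized model the device count grows by 1/0.49 and the total transistor width by 1/0.7, which round to the 2x and 1.4x values used in the text.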

III. POWER CONSUMPTION 

Power has always been one of the foremost issues in
system design. Power dissipation, device
performance, functionality, battery life, cost, and
size all depend on one another. A hand-held device,
for example, must be small. There is, therefore, no
room for a fan or a large battery. Similarly, a personal
computer should be inexpensive; few are willing to
pay for exotic cooling technologies. In fact, 15
high-performance processors have already reached the
power density limit for cost-effective cooling. All
these things limit the amount of power a processing
chip can burn. The costs of power dissipation extend
beyond the power used for computing. Take a data
center, for example. Firstly, there is, of course, the
electricity bill from the computers. Secondly, there is
the electricity bill and maintenance for the air
conditioning system, which has to remove the heat
due to power dissipation. Finally, thermal concerns
dictate a maximum power density of a system; in
other words, the more power a system burns, the more
space it must occupy. Therefore, we must add in the
rent for the space occupied by the system. In all, one
account calculates power dissipation at 25% of the
total cost of a data center [7]. Chip power can be
divided into two main components: dynamic
switching and static leakage. Dynamic power
dissipation, ignoring short-circuit current, which is
usually a small fraction of total dynamic power, is
given by $P = \frac{1}{2} f C V^{2}$, where f is the
clock frequency, V is the supply voltage, and C is the
average total on-chip capacitance switched per cycle.
Ever since the 0.5 µm generation, the gate dielectric
oxide thickness, supply voltage, and threshold voltage
have scaled with device dimensions by 0.7x per
generation to limit the growth of dynamic power
consumption while improving performance. This,
however, is only half the power story. The reduction of
oxide thickness and threshold voltage has led to
exponential increases in static leakage power. There
are six leakage
mechanisms in nanometer-scale transistors [8], of
which the three most significant are subthreshold
leakage, gate leakage, and band-to-band tunneling
(BTBT) leakage [9]. These are indicated in Figure 2-1.
Subthreshold leakage is the current flowing from
drain to source (or vice versa) when the transistor is
nominally off. This current increases exponentially as
the transistor’s threshold voltage is reduced and
has therefore grown exponentially with scaling. Gate
leakage is the current flowing from the gate to the
source, drain, or bulk (or vice versa). It is caused by
direct tunneling of electrons or holes through the
oxide insulator. This current increases exponentially
as the transistor’s oxide thickness is reduced, leading
to the exponential increase in gate leakage.
Band-to-band tunneling is the current flowing through
the reverse-biased drain/substrate and source/substrate
junctions. This current grows exponentially with the
doping concentrations on either side of the junction,
which have also increased in scaled devices, leading to
the exponential increase in BTBT leakage. Subthreshold
leakage was the major component of total leakage at
technologies larger than 130 nm (drawn gate length).
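
The trade-off described in this section can be made concrete with a small numerical sketch. It evaluates the dynamic power expression from the text and, for the leakage side, the commonly used exponential dependence of subthreshold current on threshold voltage. The operating point (1 GHz, 1 nF, 1.2 V), the slope factor n = 1.5, and the 26 mV thermal voltage are assumed values chosen only for illustration; they are not measurements from this work.

```python
import math


def dynamic_power(freq_hz, cap_farads, vdd_volts):
    """Dynamic switching power P = 1/2 * f * C * V^2 (short-circuit current ignored)."""
    return 0.5 * freq_hz * cap_farads * vdd_volts ** 2


def subthreshold_growth(vth_reduction_volts, n=1.5, thermal_voltage=0.026):
    """Factor by which subthreshold current grows when Vth is lowered.

    Assumes the usual exp(-Vth / (n * vT)) dependence; n and vT are
    illustrative assumptions, not values from the paper.
    """
    return math.exp(vth_reduction_volts / (n * thermal_voltage))


# Hypothetical operating point, before and after a 0.7x supply scaling step.
p_before = dynamic_power(1e9, 1e-9, 1.2)
p_after = dynamic_power(1e9, 1e-9, 1.2 * 0.7)
print(f"dynamic power ratio: {p_after / p_before:.2f}")            # roughly 0.49
print(f"leakage growth for 100 mV lower Vth: {subthreshold_growth(0.1):.1f}x")
```

With frequency and capacitance held fixed, a 0.7x supply roughly halves the dynamic power, while under the assumed parameters lowering the threshold voltage by 100 mV raises subthreshold current by more than an order of magnitude.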

IV. LITERATURE REVIEW 

[1] Madhav Dasari et al., “Novel Ultra Low Power
Dual Edge Triggered Retention Flip-flop for
Transiently Powered Systems”, IEEE 7th
International Advance Computing Conference, 2017.

The paper highlights the need for an improved memory
unit that preserves the state of operation in
Transiently Powered Systems by reducing the
retention latency and enhancing energy
efficiency. The authors provide a quantitative analysis
of different DETFF designs with a retention latch,
along with SETFF. The designs are evaluated by
applying clock and power gating at various levels of
operating voltage to verify their suitability for
energy-harvesting applications. A DETFF based on
dynamic design is proposed that excels at all three
levels, i.e. super-threshold, sub-threshold, and super +
sub-threshold, when compared with existing DETFF
designs.

[2] Andrea Bonetti et al., “An Overlap-Contention
Free True-Single-Phase Clock Dual-Edge-Triggered 
Flip-Flop”, IEEE International Symposium on 
Circuits and Systems (ISCAS), 2015. 

This paper presented a novel dual-edge-triggered flip- 
flop topology to solve the inherent clock-overlap risk 
in the majority of the previously presented DETFFs. 
The failure risk due to clock-overlap was 
demonstrated on a popular DET-TGLM gate, showing 
an unacceptable error-rate at near-threshold voltages. 
The proposed fully-static true single-clock-phase 
DETFF was shown to be fully functional at a similar 
operating point, under local and global process 
variations and at a wide range of temperatures. In 
addition, the proposed cell was found to provide the 
best performance and energy-efficiency among static 
DET-FF options. 

[3] Nitin Kumar Saini et al., “Low Power Dual Edge
Triggered Flip-Flop”, IEEE Transactions, Vol. 6,
pp. 978-985, 2014.

In this paper, a new technique for the pulse generation
circuit of a Dual Edge Triggered Flip-Flop (DETFF) for
low power is presented, which enables the flip-flop to
be operated at 1.2 V. By introducing a new fast
latch and applying conditional pre-charging, the
dual-edge triggered flip-flop achieves low power
consumption with a smaller delay. A clock-gated
sense amplifier is incorporated to reduce power
consumption at low switching activity. Simulations
were done in a 90 nm process technology. The
advantages of this design are lower latency and low
power consumption; its drawback is that it is a
bulky design.


[4] Kalarikkal Absel et al., “Low-Power Dual
Dynamic Node Pulsed Hybrid Flip-Flop Featuring 
Efficient Embedded Logic”, IEEE Transactions On 
Very Large Scale Integration (VLSI) Systems, Vol. 
21, No. 9, September 2013. 

In this paper, the authors present two designs of
dual-edge flip-flops: the Dual Dynamic Node Hybrid
Flip-Flop (DDFF) and the Embedded Logic Module
(DDFF-ELM), which is based on DDFF. The proposed
designs eliminate the large capacitance present in the
precharge node of several state-of-the-art designs by
following a split dynamic node structure to separately
drive the output pull-up and pull-down transistors.
Simulations were done in a 90 nm process technology.
The design is well suited to modern high-performance
designs where area, delay overhead, and power
dissipation are of major concern. However, it requires
an analysis of the overlap period to select a proper
pulse width.

[5] Helga Evangelene et al., “A Novel Low Power
Hybrid Flipflop using Sleepy Stack Inverter Pair”, 
IEEE Science and Information Conference-14, 
London, UK, August, 2014. 

This paper presents a low-power hybrid flip-flop using
a sleepy stack inverter pair to retain the logic level
until the end of the evaluation and pre-charge phases of
the flip-flop. Sleepy stack inverter pairs are efficient
in reducing leakage power and overall power
dissipation as the technology scales down to 90 nm
and below. The performance of the proposed flip-flop
was compared with the conventional Dual Dynamic
Node Pulsed Hybrid Flip-Flop (DDNPHFF). The
design has reduced leakage power compared with other
designs. As the leakage power depends on the input
vectors, low-leakage input vectors have to be fed to
the system when it is idle. The speed of the
conventional flip-flop is maintained in the proposed
one. Area is the only trade-off, i.e. it is a bulky
design.

PROPOSED METHODOLOGY 

In the design of DPLFF, the pulse generator is
modified, and the latch is also modified to eliminate
the undesired glitch, the extra dynamic power caused
by the preset, and static power.

DPLFF DESIGN

The two pulses are generated by a local pulse
generator to avoid pulse distortions from additional
pulse buffers and wiring. The pulse generator can be
shared by a few neighboring flip-flops to reduce pulse
generator area and energy overheads. The width of the
pulses is controlled by the inverter delay chain. The
inverters in the chain can be skewed to control the
lengths of p1 and p2. The width of p2 determines the
transparency window of the latch. To reduce setup and
hold time requirements, p2 should be made as narrow
as possible. However, if p2 is too short, the circuit
will not function. Detailed simulation at all process
corners and careful control of clock pulse loading will
ensure proper functionality.
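
The dependence of pulse width on the inverter delay chain can be illustrated with a small behavioral sketch. It models a generic clock-edge pulse generator in which the pulse is the AND of the clock and a delayed, inverted copy of the clock, using unit time steps. The function name, the unit inverter delay, and the sampled clock are assumptions made for illustration; this is not the transistor-level DPLFF pulse generator.

```python
def pulse_from_delay_chain(clk, n_inverters, inv_delay_steps=1):
    """Pulse samples: high while clk is high and the delayed, inverted clk is still high."""
    if n_inverters % 2 == 0:
        raise ValueError("an odd number of inverters is needed to invert the clock")
    delay = n_inverters * inv_delay_steps
    pulse = []
    for t, level in enumerate(clk):
        delayed = clk[t - delay] if t >= delay else 0   # clock assumed low before t = 0
        pulse.append(level & (1 - delayed))             # AND with the inverted delayed clock
    return pulse


# Two rising edges of a sampled clock (hypothetical waveform).
clk = [0] * 4 + [1] * 8 + [0] * 8 + [1] * 8
wide = pulse_from_delay_chain(clk, n_inverters=3)
narrow = pulse_from_delay_chain(clk, n_inverters=1)
print(sum(wide), sum(narrow))   # total high samples over the two edges
```

In this model the pulse width equals the total delay of the chain, so shortening the chain from three inverters to one narrows the pulse by the same ratio.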

The conventional way to generate a pair of pulses 
uses an inverter delay chain as in Fig. 4.10 (a). This 
design has a large number of intermediate nodes & 
thus dissipates a significant amount of energy. Our 
alternative design reduces the number of intermediate 
nodes by using an inverter delay chain both to 
generate p2 & to turn off p1.

Figure 1 shows the design of the DPLFF. The
DPLFF consists of a static latch and a pulse generator.
The static latch requires two pulses, p1 and p2, which
are generated from the active clock edge. When the
input data is ‘0’, PM2 is on and NM1 is off; the pulse
p1 goes low for a short time, and within that time the
output node is charged. When the input data is ‘1’,
PM2 is off and NM1 is on; the pulse p2 goes high for a
short time, and within that time the output node is
discharged. Figure 2 shows the operation of the
DPLFF: whenever the output node is to be in the low
state it remains low, unlike in DPSCRFF, because PM2
is off.
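
The charge and discharge behavior described above can be summarized in a short behavioral sketch. It captures only the logic-level behavior of the latch output node (charged when the data is ‘0’ during the p1 window, discharged when the data is ‘1’ during the p2 window, held otherwise). The function and argument names are hypothetical, and timing, sizing, and the keeper circuit are not modeled.

```python
def dplff_output_node(d, p1_low, p2_high, node):
    """One evaluation of the latch output node.

    d:       input data bit (0 or 1)
    p1_low:  True while pulse p1 is low (its active level)
    p2_high: True while pulse p2 is high (its active level)
    node:    current level of the output node
    """
    if d == 0 and p1_low:
        return 1      # PM2 on, NM1 off: the pull-up path charges the node
    if d == 1 and p2_high:
        return 0      # PM2 off, NM1 on: the pull-down path discharges the node
    return node       # outside the pulse window the static latch holds its state


node = 0
trace = []
for d, window_open in [(0, True), (0, False), (1, True), (1, True), (1, False)]:
    node = dplff_output_node(d, p1_low=window_open, p2_high=window_open, node=node)
    trace.append(node)
print(trace)
```

When the data stays at ‘1’ over consecutive clock edges, the node simply remains low in this model, which is the no-glitch behavior the text contrasts with DPSCRFF.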



Figure 1: DPLFF using Novel Pulse Generation 

The DPLFF has symmetric properties: pulse p1 goes low
and p2 goes high at almost the same time, and the input
should not change during this time. However, the width
of the pulses p1 and p2 is less than that of the
DPSCRFF's pulses p1 and p2. In DPLFF, the NMOS in the
clock generator switches on after one inverter
delay, whereas in DPSCRFF it switches on after three
inverter delays. This makes the DPLFF's pulse (p1 and
p2) width less than that of the DPSCRFF's pulses. The
propagation delay required in the combinational
circuit is also less than with DPSCRFF. The maximum
clock frequency also increases in a shift register when
DPLFF is used.





Figure 2: DPLFF Operation Waveforms 



Figure 3: DPLFF Delay Analysis 


V. SIMULATION RESULTS 

The proposed design is simulated using transient
analysis with the PTM 65 nm model in HSPICE at a
1 GHz clock frequency; the results are tabulated in
Table 1 and Table 2. DPLFF is designed and simulated
using multiple threshold voltages: low Vth is used in
the latch and high Vth everywhere else. The simulation
results show that DPLFF performs better than other
existing techniques in terms of power dissipation in
all situations. The static power reduction is evident
when clock and data are set to zero. The static power
saving is considerable, and considering that many
flip-flops are used in a circuit, the overall power
saving will be large for nanometer designs. The power
reduction is achieved by removing two inverters, and
static power is also reduced in the latch by stacking
two PMOS transistors (PM1, PM2). Overall power
dissipation is reduced compared with DPSCRFF. The
timing analysis shows that DPLFF requires less time
when the output changes from ‘1’ to ‘0’. The
DPSCRFF delay, by contrast, is zero when the output
changes from ‘0’ to ‘1’ because the output node is
preset before the arrival of the input. Nevertheless,
the clock frequency achievable for a flip-flop is
decided by logic level changes in both directions.
Thus the operational frequencies for DPLFF will be
higher compared to DPSCRFF [18].


Condition                              Proposed DPLFF

1. Output changes from ‘1’ to ‘0’      2.788 µW
2. Output changes from ‘0’ to ‘1’      2.832 µW
3. Output is constant ‘0’              2.163 µW
4. Output is constant ‘1’              1.926 µW
5. Clk = ‘0’, D = ‘0’                  0.1676 µW


Table 1: Power consumption of Proposed DPLFF 


Condition                            Delay (p-sec)   Average Delay CLK-Qb (p-sec)

Output changes ‘1’ to ‘0’ (TpL-H)    11.19           12.56
Output changes ‘0’ to ‘1’ (TpH-L)    13.94


Table 2: Delay Performance of Proposed DPLFF 


Flip-flop         Number of Transistors   Data driving Power (µW)

SDFF              23                      4.56
HLFF              20                      5.21
PowerPC 603       22                      16.13
CDMFF             22                      13.30
DDFF-ELM [4]      22                      3.18
Proposed DPLFF    16                      2.788


Table 3: Comparison of Power Consumption


[Chart: Power Comparison]

Figure 4: Graph of Power Performance & No. of
transistors

From Table 3 it can be clearly seen that the
proposed design uses fewer transistors, due to
which the average power consumption is also
reduced compared with other designs. The design
improves on the power consumption of [4], in which
22 transistors were used; however, in terms of power
consumption per transistor, the proposed design
consumes more power.
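
The per-transistor comparison can be checked directly from the Table 3 entries. The sketch below uses the transistor counts and power values listed for [4] and for the proposed design; the unit is assumed to be µW, consistent with Table 1, and the variable names are illustrative.

```python
# (transistor count, data driving power) taken from Table 3; power unit assumed to be µW.
designs = {
    "DDFF-ELM [4]": (22, 3.18),
    "Proposed DPLFF": (16, 2.788),
}

per_transistor = {name: power / count for name, (count, power) in designs.items()}
total_saving = 1 - designs["Proposed DPLFF"][1] / designs["DDFF-ELM [4]"][1]

for name, value in per_transistor.items():
    print(f"{name}: {value:.4f} per transistor")
print(f"total power saving vs [4]: {total_saving:.1%}")
```

The total power of the proposed design is lower than that of [4] by roughly 12%, while its power per transistor is higher, which is the distinction drawn in the paragraph above.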


Flip-flop          Delay (p-sec)          Average CLK-Qb (p-sec)
                   TpL-H       TpH-L

SETFF [1]          7 5.6*      92.1 ii    8-1.91
DUFF 1             1712        L67.5      149 4
PI TFT m           53.27       5623       57.2*
PPFTFF             71J3                   Sl.W
Proposed DPLFF     11.19       13.94      12.56


Table 4: Comparison of Delay Performance

The proposed design has better delay performance
than the other existing designs. It takes only
11.19 p-sec when the output changes ‘1’ to ‘0’, i.e.
TpL-H, and 13.94 p-sec when the output changes ‘0’ to
‘1’, i.e. TpH-L. The average delay is 12.56 p-sec for
the proposed design.
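
As a quick consistency check on these figures, the sketch below recomputes the average of the two reported transition delays and identifies the slower transition, which is the one that bounds the clock period contributed by the flip-flop. The variable names are illustrative, and setup time and combinational logic delay are deliberately left out.

```python
# Reported CLK-Qb delays of the proposed DPLFF, in picoseconds.
t_plh_ps = 11.19   # output changes '1' to '0'
t_phl_ps = 13.94   # output changes '0' to '1'

average_ps = (t_plh_ps + t_phl_ps) / 2
worst_case_ps = max(t_plh_ps, t_phl_ps)

print(f"average delay:    {average_ps:.3f} ps")
print(f"worst-case delay: {worst_case_ps:.2f} ps")
```

The computed mean of 12.565 p-sec agrees with the reported 12.56 p-sec to the precision given in the table, and the ‘0’ to ‘1’ transition is the slower of the two.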

[Chart: Delay Comparison; series: Delay TpL-H (p-sec),
Delay TpH-L (p-sec), Average Delay CLK-Qb (p-sec)]



Figure 5: Graph of Delay Performance 


VI. CONCLUSION 

In this paper, various flip-flop structures have been
studied. To reduce power consumption, the pulse
generator circuitry should be built in along with
the flip-flop itself. If a pulse generator is
included with the DPSCRFF structure, power
consumption can be reduced. In this work, a new
flip-flop design, the Double Pulse Latch Flip-flop
(DPLFF), is proposed. DPLFF eliminates unnecessary
glitches, which consume extra power. DPLFF
consumes less power for the same delay than other
existing techniques, which include some of the
fastest known flip-flops. In serial operation as a
shift register, the proposed DPLFF can operate at a
higher frequency. The stacking of transistors in
the latch stage reduces the subthreshold leakage
current, and thus the static power consumption is
also lower for DPLFF. This makes it better suited
to low-power circuits in deep-submicron technology,
where leakage is more dominant.